
Sorting


Given the array [2,3,4,1,6], our goal is to sort the array so that it looks like [1,2,3,4,6].

I. Insertion Sort

Insertion sort works by taking each element from the unsorted portion and inserting it into its correct position in the already sorted portion of the array.

1. Concept & Implementation

  1. Start with the second element (at index 1), assuming the first element is already sorted
  2. Compare the current element with the previous elements
  3. If the previous element is greater than the current element, move the previous element one position ahead
  4. Repeat step 3 until you find the correct position for the current element
  5. Insert the current element at the correct position
  6. Repeat steps 2-5 for all elements in the array
Example:
  1. The i pointer points at the element we are currently inserting into the sorted portion of the array.
  2. The j pointer starts off one index to the left of i.
  3. Our goal is to find the position that arr[i] should be inserted into the sorted portion of the array.
  4. We continue swapping it with arr[j] until we find the correct position.
  5. After each swap we shift j to the left by 1
  6. We stop once the element is greater than or equal to the element to its left.
def insertion_sort(arr):
    for i in range(1, len(arr)):
        j = i - 1
        while j >= 0 and arr[j] > arr[j + 1]:  # the previous num is bigger than the next num
            # we swap them
            temp = arr[j]
            arr[j] = arr[j + 1]
            arr[j + 1] = temp
            j -= 1  # we move back through the array, stopping once we pass its start (j < 0)

    return arr
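Running the function on the array from the introduction (restated compactly here so the snippet runs on its own) produces the expected sorted output:

```python
def insertion_sort(arr):
    for i in range(1, len(arr)):
        j = i - 1
        # Swap the current element leftward until it reaches its position
        while j >= 0 and arr[j] > arr[j + 1]:
            arr[j], arr[j + 1] = arr[j + 1], arr[j]
            j -= 1
    return arr

print(insertion_sort([2, 3, 4, 1, 6]))  # [1, 2, 3, 4, 6]
```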

2. Stability

Stability in sorting algorithms means that elements with equal keys (values being compared) will maintain their original relative order in the sorted output.

Insertion sort is stable: equal elements are guaranteed to keep their original relative order.

Example:
  • Imagine you have these cards, where the number is what we're sorting by.
2♥, 5♠, 5♥, 1♦, 3♣
  • After sorting by number using a stable sort, we have:
1♦, 2♥, 3♣, 5♠, 5♥

→ Notice the two 5s - the spade (♠) is still before the heart (♥) because that was their original order before sorting.

  • If we had used an unstable sort, the 5s might have swapped positions, giving:
1♦, 2♥, 3♣, 5♥, 5♠
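The card example can be checked in code. This sketch uses (rank, suit) tuples as stand-ins for the cards and Python's built-in sorted(), which is also a stable sort:

```python
cards = [(2, "♥"), (5, "♠"), (5, "♥"), (1, "♦"), (3, "♣")]

# Sort by rank only; suits of equal ranks keep their original order
stable = sorted(cards, key=lambda card: card[0])
print(stable)
# [(1, '♦'), (2, '♥'), (3, '♣'), (5, '♠'), (5, '♥')] — 5♠ still before 5♥
```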

Summary

a. Time Complexity

  • If the input array is already sorted you may notice that the inner while loop will never execute. This is the best case scenario for insertion sort where the time complexity is O(n).
    • The more sorted the input array is, the faster insertion sort runs, potentially approaching O(n) time.
  • In the worst case scenario, the time complexity is O(n^2). For the element at index i, the inner while loop can execute up to i times, which sums to roughly n²/2 steps over the whole array.
  • The input leading to the worst case is an array in reverse order: every element must travel all the way back to the start, so the inner loop runs the maximum number of times on every iteration of the outer loop.

Even if the array is in random order, the time complexity is still O(n^2) in the average case.
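One rough way to observe the best-case/worst-case gap is to count inner-loop swaps. The instrumented copy below (the name insertion_sort_swaps is illustrative, not from the original code) does zero swaps on sorted input but n·(n−1)/2 swaps on reversed input:

```python
def insertion_sort_swaps(arr):
    swaps = 0
    for i in range(1, len(arr)):
        j = i - 1
        while j >= 0 and arr[j] > arr[j + 1]:
            arr[j], arr[j + 1] = arr[j + 1], arr[j]
            swaps += 1
            j -= 1
    return swaps

n = 100
print(insertion_sort_swaps(list(range(n))))         # 0: already sorted, the while loop never runs
print(insertion_sort_swaps(list(range(n, 0, -1))))  # 4950: reverse order, n*(n-1)/2 swaps
```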

b. Space Complexity

Insertion sort is an in-place sorting algorithm. This means that the space complexity is O(1) since no additional data structures are used.

II. Merge Sort

1. Concept & Implementation

Merge sort is a divide-and-conquer algorithm that works by:

  • Splitting the unsorted array into halves until the subarrays hit a size of 1.
  • Merging the sorted subarrays back together, two at a time, until the full array is sorted.
  1. We have a base case where if the length of the array is less than or equal to 1, we return the array because it is already sorted.
  2. Otherwise we calculate the middle index of the array.
  3. We recursively call mergeSort() on the left half of the array. We do this by passing pointers start and middle to the function, which in this case represent the start and end of the left half of the array.
  4. We recursively call mergeSort() on the right half of the array. We do this by passing pointers middle + 1 and end to the function, which in this case represent the start and end of the right half of the array.
  5. We merge the two sorted halves by calling the merge() function, which is discussed more below.
def merge_sort(arr, start, end):
    # Base case: if array has 0 or 1 elements, it's already sorted
    if end - start + 1 <= 1:
        return arr

    # Find the middle point to divide the array
    middle = (start + end) // 2

    # Sort the left half
    merge_sort(arr, start, middle)

    # Sort the right half
    merge_sort(arr, middle + 1, end)

    # Merge the sorted halves
    merge(arr, start, middle, end)

    return arr

Example: Let's take an array of size 5 as an example, [3,2,4,1,6]. We want to sort it in an increasing, or non-decreasing order if we had duplicates.

  1. We will be splitting the array until we have single elements.

  2. Then, we merge them back together in sorted order.

  • Sort and merge the left half.

  • Sort and merge the right half.

merge() function

The heart of merge sort is the merge() function, which combines two sorted arrays into one.

The merge operation uses three pointers:

  • i: tracks position in left subarray.
  • j: tracks position in right subarray.
  • k: tracks where to place next value in the original array.
def merge(arr, start, middle, end):
    # Create temporary arrays to hold the split data
    left_array = arr[start:middle + 1]
    right_array = arr[middle + 1:end + 1]

    i = 0      # Index for left subarray
    j = 0      # Index for right subarray
    k = start  # Index for merged array

    # Compare elements from both subarrays and place the smaller one first
    while i < len(left_array) and j < len(right_array):
        if left_array[i] <= right_array[j]:
            arr[k] = left_array[i]
            i += 1
        else:
            arr[k] = right_array[j]
            j += 1
        k += 1

    # Copy any remaining elements from left subarray
    while i < len(left_array):
        arr[k] = left_array[i]
        i += 1
        k += 1

    # Copy any remaining elements from right subarray
    while j < len(right_array):
        arr[k] = right_array[j]
        j += 1
        k += 1
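Putting the two functions together on the example array (both are restated compactly here so the snippet runs on its own):

```python
def merge_sort(arr, start, end):
    if end - start + 1 <= 1:          # base case: 0 or 1 elements
        return arr
    middle = (start + end) // 2
    merge_sort(arr, start, middle)     # sort the left half
    merge_sort(arr, middle + 1, end)   # sort the right half
    merge(arr, start, middle, end)     # merge the sorted halves
    return arr

def merge(arr, start, middle, end):
    left_array = arr[start:middle + 1]
    right_array = arr[middle + 1:end + 1]
    i, j, k = 0, 0, start
    # Take the smaller front element until one side is exhausted
    while i < len(left_array) and j < len(right_array):
        if left_array[i] <= right_array[j]:
            arr[k] = left_array[i]
            i += 1
        else:
            arr[k] = right_array[j]
            j += 1
        k += 1
    # Copy whatever remains on either side
    while i < len(left_array):
        arr[k] = left_array[i]
        i += 1
        k += 1
    while j < len(right_array):
        arr[k] = right_array[j]
        j += 1
        k += 1

arr = [3, 2, 4, 1, 6]
print(merge_sort(arr, 0, len(arr) - 1))  # [1, 2, 3, 4, 6]
```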

2. Stability

Merge sort is stable, meaning it preserves the relative order of equal elements.

Example:
  • If we have [5A, 2, 5B, 1] (where 5A and 5B represent the same value with different identities)
  • After sorting: [1, 2, 5A, 5B] - the 5A still comes before 5B

This stability comes from the merge() function's comparison:

if left_array[i] <= right_array[j]:  # Using <= preserves order of equal elements

Summary

a. Time Complexity

Merge Sort runs in O(n log n).

  • If n is the length of our array at any given level, our subarrays in the next level have a length n/2.
  • Let x be the number of times we halve n before we hit the base case; then n / 2^x = 1.
    • Solving for x gives x = log n.
    • This means we have log n levels in our recursion tree.
  • For each level, merge() will take n steps because at any level of the tree, we have n elements to compare and sort - where n is the length of the original array.
  • To get the time complexity: number of levels × cost of each level = log n × n. → This results in O(n log n) time complexity.
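The level count can be sanity-checked with a tiny helper (illustrative only, not part of merge sort itself) that counts how many times an array halves before the subarrays reach size 1:

```python
def halving_levels(n):
    """Count the halvings needed to shrink an array of size n to size 1."""
    count = 0
    while n > 1:
        n = (n + 1) // 2  # size of the larger half after a split
        count += 1
    return count

print(halving_levels(8))   # 3, i.e. log2(8)
print(halving_levels(16))  # 4, i.e. log2(16)
```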

b. Space

The recursion tree has height log n, which is the maximum depth of the call stack. In addition, the merge() function allocates temporary arrays, which together hold at most n elements at any given time.

→ Total Space Complexity = O(n + log n), which simplifies to O(n).

c. Comparison with Insertion Sort

  1. Insertion Sort:
    • Best case: O(n) when nearly sorted
    • Worst case: O(n²)
    • In-place with O(1) extra space
    • Good for small or nearly sorted arrays
  2. Merge Sort:
    • Always O(n log n) performance
    • Requires O(n) extra space
    • More efficient for large arrays
    • Stable sort algorithm